International email arises from the combined provision of internationalized domain names (IDN) and email address internationalization (EAI). The result is email that contains international characters (characters which do not exist in the ASCII character set), encoded as UTF-8, in the email header and in supporting mail transfer protocols. The most significant aspect of this is the allowance of email addresses (also known as email identities) in most of the world's writing systems, at both interface and transport levels.
Email addresses
Traditional email addresses are limited to characters from the English alphabet and a few other special characters.[RFC 5322: Internet Message Format]
The following are valid traditional email addresses:
stellyamburrr985@example.com (English, ASCII)
Abc.123@example.com (English, ASCII)
user+mailbox/department=shipping@example.com (English, ASCII)
!#$%&'*+-/=?^_`.{|}~@example.com (English, ASCII)
"Abc@def"@example.com (English, ASCII)
"Fred\ Bloggs"@example.com (English, ASCII)
"Joe.\\Blow"@example.com (English, ASCII)
Speakers of a language written in Cyrillic might wish to use, for example, бацка.махно as their identifier, but be forced to use a transcription such as batska.makno@example.org or some completely unrelated identifier instead. The same is true of Chinese, Japanese, and other users whose languages do not use the Latin script, and it also applies to users from non-English-speaking European countries whose desired addresses might contain diacritics (e.g. André or Płużyna). As a result, email users are forced to identify themselves using non-native scripts, which may result in errors due to ambiguity of transliteration (for example, иван.сергеев may become ivan.sergeev, ivan.sergeyev, or something else). Alternatively, developers of email systems must compensate for this by converting identifiers from their native scripts to ASCII and back again at the user interface layer.
International email, by contrast, uses Unicode characters encoded as UTF-8, allowing the text of addresses to be encoded in most of the world's writing systems.[RFC 6530: Overview and Framework for Internationalized Email]
The following are all valid international email addresses:
(Chinese, Unicode)
ಬೆಂಬಲ@ಡೇಟಾಮೇಲ್.ಭಾರತ (Kannada, Unicode)
अजय@डाटा.भारत (Hindi, Unicode)
квіточка@пошта.укр (Ukrainian, Unicode)
χρήστης@παράδειγμα.ελ (Greek, Unicode)
Dörte@Sörensen.example.com (German, Unicode)
коля@пример.рф (Russian, Unicode)
مثال@موقع.عر (Arabic, Unicode)
UTF-8 headers
Although the traditional format for the email header section allows non-ASCII characters to be included in the value portion of some header fields using MIME-encoded words (e.g. in display names or in a Subject header field), MIME encoding must not be used to encode other information in a header, such as an email address, or header fields like Message-ID or Received. Moreover, MIME encoding requires extra processing of the header to convert the data to and from its MIME-encoded word representation, and harms the readability of a header section.
The 2012 standards RFC 6532 and RFC 6531 allow Unicode characters to be included in header content using UTF-8 encoding and transmitted via SMTP, but in practice support is only slowly rolling out.
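The difference between a MIME-encoded word and a raw UTF-8 header value can be illustrated with a minimal sketch using Python's standard `email` package. The addresses and the subject text are made-up sample values, and the helper `subject_line` is a hypothetical name introduced only for this illustration; the sketch serializes the same message twice, once under an ASCII-only policy and once under a policy that permits UTF-8 in headers.

```python
from email import policy
from email.message import EmailMessage

# Sample message; all values here are illustrative assumptions.
msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Привет"
msg.set_content("Hello")

# policy.SMTP keeps headers ASCII-only, so non-ASCII text becomes a MIME-encoded word.
legacy_bytes = msg.as_bytes(policy=policy.SMTP)

# policy.SMTPUTF8 allows UTF-8 directly in header values.
utf8_bytes = msg.as_bytes(policy=policy.SMTPUTF8)


def subject_line(raw: bytes) -> bytes:
    """Return the raw Subject header line from a serialized message (hypothetical helper)."""
    for line in raw.split(b"\r\n"):
        if line.lower().startswith(b"subject:"):
            return line
    return b""


print(subject_line(legacy_bytes))                   # ASCII bytes containing an encoded word
print(subject_line(utf8_bytes).decode("utf-8"))     # the readable UTF-8 form
```

The first form survives ASCII-only software at the cost of readability; the second is readable as-is but should only be handed to software that accepts UTF-8 headers.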
Interoperability via downgrading
Domain internationalization works by downgrading. UTF-8 parts, known as U-Labels, are transformed into A-Labels via an ad hoc method called IDNA (Internationalizing Domain Names in Applications). For example, sörensen.example.com is encoded as xn--srensen-90a.example.com. In 2003, when the need was addressed, that seemed easier than checking that all DNS software could handle UTF-8 strings, although in theory DNS can transport binary data. This encoding must be applied before issuing DNS queries.
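As a minimal sketch of the conversion described above, Python's built-in `idna` codec can turn a Unicode domain into its ASCII form and back. Note that this codec follows the 2003 version of IDNA; the function names `domain_to_ascii` and `domain_to_unicode` are hypothetical helpers chosen for this illustration.

```python
def domain_to_ascii(domain: str) -> str:
    """Convert a Unicode domain name (U-labels) to its ASCII form (A-labels)."""
    # The built-in "idna" codec processes each dot-separated label separately.
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(domain: str) -> str:
    """Convert an ASCII domain name containing A-labels back to Unicode."""
    return domain.encode("ascii").decode("idna")


print(domain_to_ascii("sörensen.example.com"))           # xn--srensen-90a.example.com
print(domain_to_unicode("xn--srensen-90a.example.com"))  # sörensen.example.com
```

Labels that are already plain ASCII, such as `example` and `com`, pass through unchanged, which is why only the first label gains the `xn--` prefix.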
Since traditional email standards constrain all email header values to ASCII-only characters, the presence of UTF-8 characters in email headers may decrease the stability and reliability of transporting such email, because some email servers do not support these characters. Compliance with UTF-8 strings must be checked software package by software package (see Adoption below). The IETF proposed an experimental method by which email could be downgraded into the legacy all-ASCII format that all standard email servers support. This proposal was deemed too cumbersome: the meaning of the left-hand side of an email address is local to the target server, so there is no way to check whether xn--something is a valid user name in a given domain. The proposal was obsoleted in 2012.
Standards framework
The set of Internet RFC documents RFC 6530, RFC 6531, RFC 6532, and RFC 6533, all of them published in February 2012, define mechanisms and protocol extensions needed to fully support internationalized email addresses. These changes include an SMTP extension and extension of email header syntax to accommodate UTF-8 data. The document set also includes discussion of key assumptions and issues in deploying fully internationalized email.
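The SMTP extension from RFC 6531 is announced by a server with the `SMTPUTF8` keyword in its EHLO reply. The following is a rough sketch, not a complete client: it only inspects envelope addresses, and the function names and the sample reply text are hypothetical values made up for illustration.

```python
def advertises_smtputf8(ehlo_reply: str) -> bool:
    """Return True if a multi-line EHLO reply lists the SMTPUTF8 keyword."""
    for line in ehlo_reply.splitlines():
        # Reply lines look like "250-KEYWORD [params]" or "250 KEYWORD [params]".
        if not line.startswith("250"):
            continue
        parts = line[4:].split()
        if parts and parts[0].upper() == "SMTPUTF8":
            return True
    return False


def needs_smtputf8(*addresses: str) -> bool:
    """Return True if any of the given addresses contains non-ASCII characters."""
    return any(not address.isascii() for address in addresses)


def can_send(ehlo_reply: str, sender: str, recipient: str) -> bool:
    """Decide whether the envelope can be sent to a server with this EHLO reply."""
    return advertises_smtputf8(ehlo_reply) or not needs_smtputf8(sender, recipient)


# Made-up EHLO reply from a hypothetical server.
SAMPLE_REPLY = "250-mx.example.net\r\n250-8BITMIME\r\n250 SMTPUTF8"
print(can_send(SAMPLE_REPLY, "sender@example.com", "коля@пример.рф"))
```

In a real client the same check is available through `smtplib.SMTP.has_extn("smtputf8")` after `ehlo()` has been called; the string-based version here only keeps the example free of network access.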
Unicode also has recommended Email Security Profiles for Identifiers.
Adoption
- 2010-10-29: Afilias and the .JO Registry brought native-language email to Arabic Internet users, as announced via PRWeb.
- 2013-11-14: The Bat! email client implemented support for internationalized domain names (IDN) in email addresses.
- 2014-07-15: The Postfix mailer started supporting Internationalized Email, also known as EAI or SMTPUTF8, defined in RFC 6530 through RFC 6533.[Postfix SMTPUTF8 support (unicode email addresses)] Initial support was made available in development version 20140715, and on 2015-02-08 it reached stable release 3.0.0.[Postfix stable release 3.0.0] This supports UTF-8 in SMTP or LMTP sender addresses, recipient addresses, and message header values.
- 2014-07-19: The XgenPlus email server started supporting IDN-based email, also known as SMTPUTF8 support, especially for the .भारत domain.
- 2014-08-05: Google announced that Gmail will recognize addresses that contain accented or non-Latin characters, with more support for internationalization to follow.[A first step toward more global email] Its mail servers (MX MTAs) announce support for the SMTP Extension for Internationalized Email (SMTPUTF8, RFC 6531).
- 2014-09-30: Message Systems announced that its product Momentum (versions 4.1 and 3.6.5) provides SMTPUTF8 support, the email address internationalization extension to SMTP, allowing email to be sent to recipients with non-Western addresses.[Message Systems Introduces Latest Version Of Momentum With New API-Driven Capabilities]
- 2014-10-22: Version 2.10.0 of the Amavis mail content filter was released, adding support for SMTPUTF8, EAI, and IDN.[Amavis 2.10.0 released]
- 2016-10-18: Data Xgen Technologies launched a free linguistic email address service under the name "DATAMAIL". In support of Digital India, this Indian email app supports IDN (internationalized domain names) in Hindi (हिन्दी), Gujarati (ગુજરાતી), Urdu (اردو), Punjabi (ਪੰਜਾਬੀ ਦੇ), Tamil (தமிழ்), Telugu (తెలుగు), Bengali (বাংলা), Marathi (मराठी), and Latin-script English. DATAMAIL has also launched international languages for countries using Arabic (العَرَبِيَّة), Russian (Русский) and Chinese (汉语/漢語) as their base language.
- 2016-12-07: почта.рус launched fully Russian (Cyrillic) email at a press conference in Moscow.
- 2017-03-07: Apple's App Store approved publication of an iOS app with EAI support.
- 2017-12-03: Chief Minister Vasundhara Raje of Rajasthan launched a free email service at the rajasthan.in and @राजस्थान.भारत domains. Rajasthan became the world's first state to provide every citizen with an email address in their own language.
- 2017-12-27: Microsoft announced upcoming IDN email support in Office 365, and also announced partner XgenPlus hosting IDN mailboxes.
- 2018-01-03: Microsoft added email internationalization to Exchange Online.
- 2018-09-18: Courier-MTA released support for Unicode email messages, in UTF-8, for all Courier packages. In addition, Courier-IMAP uses Unicode (UTF-8) for names of maildir folders.
- 2020-07-29: DataMail launched Kannada-language email addresses to break the language barrier.
The ICANN-sponsored Universal Acceptance Working Group is working to make EAI accepted in more places and publishes annual reports on acceptance.
See also
- Internationalized domain name
- Email Address Internationalization (EAI)
- Unicode and email
- IETF
- ICANN